The content on your Web site may be the most important factor in improving your results in organic search marketing (and your Web site search, too). Many aspects of your content must be monitored: keyword density, titles, descriptions, and more. But it’s one thing to set standards in your organization and train the people responsible; it’s another to check that they actually comply.
Everyone tries to set some kind of standards for their content, even if that means just taping a list of tips to the wall to remind yourself when you write for your small Web site. For a larger site, your standards are more formal; you may even have committees to ratify changes for especially large sites.
Large or small, the standards say a lot of the same things when it comes to improving your content for search. Use your keywords in titles, descriptions, and body text. But how do you enforce those rules?
One way is to use a content reporter—software that creates a content scorecard showing how well each page complies with your site’s standards. Your scorecard lists which pages contain titles and descriptions and which don’t (missing titles and descriptions are common searchability errors). With a little more work, you can check that each page has a unique title and description (not one used on other pages). With a bit more work still, you can check keyword density (the number of keyword occurrences divided by the total number of words) for each page.
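As a rough sketch of what such a check might look like, here is a small Python function that tests one page. The function name, the single-word keyword matching, and the rounding are illustrative assumptions, not part of any particular product:

```python
import re

def check_page(title, description, body_text, keywords):
    """Score one page against simple searchability standards."""
    words = re.findall(r"[a-z0-9']+", body_text.lower())
    keyword_set = {k.lower() for k in keywords}
    hits = sum(1 for word in words if word in keyword_set)
    return {
        "has_title": bool(title and title.strip()),
        "has_description": bool(description and description.strip()),
        # keyword density = keyword occurrences / total words
        "keyword_density": round(hits / len(words), 3) if words else 0.0,
    }

# A page with 8 keyword occurrences in 400 words scores a density of 0.02 (2%).
```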
If you have fewer than 100 pages on the site, your scorecard can be a simple list of all pages, with columns calling out whether titles and descriptions are present and unique. With thousands of pages, you’ll want reports that roll up totals organizationally: “93% of the pages in the Consumer Products Division comply with standards, but 7% do not.”
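The roll-up itself is simple arithmetic: group pages by organization and divide compliant pages by total pages. A minimal sketch follows; the division names and compliance flags are made up purely for illustration:

```python
from collections import defaultdict

# Each record: (division, page_complies_with_standards)
pages = [
    ("Consumer Products", True),
    ("Consumer Products", True),
    ("Consumer Products", False),
    ("Enterprise Services", True),
]

totals = defaultdict(lambda: [0, 0])  # division -> [compliant, total]
for division, complies in pages:
    totals[division][1] += 1
    if complies:
        totals[division][0] += 1

for division, (compliant, total) in totals.items():
    print(f"{division}: {compliant / total:.0%} of {total} pages comply")
```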
A content reporter has three major parts:
- Spider. If you have a site search engine, it likely has a spider (crawler) that discovers all the pages on your site. If you don’t, you can customize an existing spider (such as Xenu) or build your own (see the sketch after this list). Developing a spider is not an easy task, but you can check out Spidering Hacks for tips on how it’s done.
- Analyzer. After each page is crawled, you need software to pore over every tag and every word to test compliance with your standards. Your analyzer can make sure that the title and description tags contain text. It can store the text from those tags in a database so that it can check that no other page contains the same text for those tags (the sketch after this list shows a simple version of both checks). If you carefully encode your top keywords (in order) in your keywords tag, your analyzer can check that those words are present in titles and descriptions, and it can check the keyword density in the body text. Most site search engines have a content pipeline that allows you to insert your custom analyzer. One site search engine, IBM’s OmniFind, has an open framework called Unstructured Information Management Architecture (UIMA) that makes it easier to insert your analyzer.
- Scorecard. Once the analyzer has checked each page, you can use Crystal Reports or a similar program to produce your scorecard. Then, use “management by embarrassment” to go after the managers and writers of each group of pages that exhibits low compliance. Over time, they’ll correct their content if only to get good grades on the scorecard.
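To make the spider and analyzer more concrete, here is a minimal sketch built on the widely used requests and BeautifulSoup libraries. The starting URL, the 200-page cap, and the duplicate-title check are assumptions made for illustration; a real crawler also needs robots.txt handling, politeness delays, and better error handling:

```python
from collections import Counter
from urllib.parse import urljoin, urlparse

import requests
from bs4 import BeautifulSoup

START = "https://www.example.com/"   # assumed starting point for the crawl
visited, titles, descriptions, report = set(), Counter(), Counter(), []

queue = [START]
while queue and len(visited) < 200:  # cap the crawl for this sketch
    url = queue.pop(0)
    if url in visited:
        continue
    visited.add(url)
    try:
        html = requests.get(url, timeout=10).text
    except requests.RequestException:
        continue
    soup = BeautifulSoup(html, "html.parser")

    # Analyzer: pull out the tags the standards care about.
    title = (soup.title.string or "").strip() if soup.title else ""
    desc_tag = soup.find("meta", attrs={"name": "description"})
    desc = (desc_tag.get("content") or "").strip() if desc_tag else ""
    titles[title] += 1
    descriptions[desc] += 1
    report.append({"url": url, "title": title, "description": desc})

    # Spider: follow links that stay on the same site.
    for anchor in soup.find_all("a", href=True):
        link = urljoin(url, anchor["href"]).split("#")[0]
        if urlparse(link).netloc == urlparse(START).netloc:
            queue.append(link)

# Scorecard input: flag missing or duplicated titles and descriptions.
for row in report:
    row["title_ok"] = bool(row["title"]) and titles[row["title"]] == 1
    row["description_ok"] = bool(row["description"]) and descriptions[row["description"]] == 1
```

The report this produces is exactly the raw material a scorecard tool needs: one row per page with pass/fail flags that can be rolled up by division.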
If you have a very large site, it might make sense for you to purchase an end-to-end system, such as the one from Watchfire; it’s pricey, but it can be worth it for a site with thousands of pages. Regardless of whether you build or buy your content reporter, it’s a critical tool for improving content searchability across your entire site.